
Netflix Data from the Kaggle Website

1. CSV file import and library import

In [1]:
import pandas as pd  # pandas for loading and analysing the data
import numpy as np
import matplotlib.pyplot as plt

data = pd.read_csv(r"D:\file (1).csv")  # raw string (r"...") so the backslash in the Windows path is not read as an escape sequence
data.head(2) 
Out[1]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil August 14, 2020 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ...
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico December 23, 2016 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit...
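The `r` prefix used in the cell above makes a raw string, so backslashes in a Windows path are kept literally instead of being read as escape sequences. A small sketch of the difference follows; the path in it is a made-up example, not the notebook's file.

```python
from pathlib import PureWindowsPath

# Hypothetical path used only for illustration.
raw_path = r"D:\data\netflix.csv"        # raw string: backslashes kept as-is
escaped_path = "D:\\data\\netflix.csv"   # same text with every backslash escaped

# Both spellings produce the identical string.
same = raw_path == escaped_path

# PureWindowsPath parses Windows-style paths on any operating system.
file_name = PureWindowsPath(raw_path).name
print(same, file_name)
```

Forward slashes are another option, since they are normally accepted in Windows paths as well.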
In [2]:
data.columns
Out[2]:
Index(['Show_Id', 'Category', 'Title', 'Director', 'Cast', 'Country',
       'Release_Date', 'Rating', 'Duration', 'Type', 'Description'],
      dtype='object')
In [3]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7789 entries, 0 to 7788
Data columns (total 11 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Show_Id       7789 non-null   object
 1   Category      7789 non-null   object
 2   Title         7789 non-null   object
 3   Director      5401 non-null   object
 4   Cast          7071 non-null   object
 5   Country       7282 non-null   object
 6   Release_Date  7779 non-null   object
 7   Rating        7782 non-null   object
 8   Duration      7789 non-null   object
 9   Type          7789 non-null   object
 10  Description   7789 non-null   object
dtypes: object(11)
memory usage: 669.5+ KB
In [4]:
data.dtypes
Out[4]:
Show_Id         object
Category        object
Title           object
Director        object
Cast            object
Country         object
Release_Date    object
Rating          object
Duration        object
Type            object
Description     object
dtype: object

2. Checking for duplicates

In [5]:
data[data.duplicated()]
Out[5]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description
6300 s684 Movie Backfire Dave Patten Black Deniro, Byron "Squally" Vinson, Dominic ... United States April 5, 2019 TV-MA 97 min Dramas, Independent Movies, Thrillers When two would-be robbers accidentally kill a ...
6622 s6621 Movie The Lost Okoroshi Abba T. Makama Seun Ajayi, Judith Audu, Tope Tedela, Ifu Enna... Nigeria September 4, 2020 TV-MA 94 min Comedies, Dramas, Independent Movies A disillusioned security guard transforms into...
In [6]:
data.drop_duplicates(inplace = True)  ## inplace=True modifies data itself instead of returning a new frame
In [7]:
data[data.duplicated()] ## check whether any duplicate rows are left
Out[7]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description
In [8]:
data.shape
Out[8]:
(7787, 11)
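As a minimal sketch of what the two calls above do, here is the same pattern on a made-up three-row frame (the rows are illustrative only). `duplicated()` flags the second and later copies of a row, and `drop_duplicates()` without `inplace=True` returns a new frame and leaves the original untouched.

```python
import pandas as pd

# Toy frame (made-up rows); the last row repeats the first one exactly.
toy = pd.DataFrame({
    "Show_Id": ["s1", "s2", "s1"],
    "Title": ["A", "B", "A"],
})

dup_mask = toy.duplicated()        # True only for the repeated (later) occurrence
deduped = toy.drop_duplicates()    # returns a new frame; toy itself is unchanged

print(dup_mask.tolist())
print(toy.shape, deduped.shape)
```

The surviving rows keep their original index labels, which is why the notebook's index still runs to 7788 after the drop. Calling `reset_index(drop=True)` afterwards would renumber them if that matters.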
In [9]:
data.head(2)
Out[9]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil August 14, 2020 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ...
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico December 23, 2016 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit...
In [10]:
data.isnull().sum()  ## number of null values per column; sum needs its parentheses, otherwise only the bound method is displayed
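`isnull()` returns a frame of booleans, and calling `.sum()` on it adds them up column by column. Leaving the parentheses off `sum` only displays the method object rather than the counts. A minimal sketch on a made-up frame:

```python
import numpy as np
import pandas as pd

# Toy frame with a few missing values (illustrative data only).
toy = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "Director": [np.nan, "X", np.nan],
    "Cast": ["P", np.nan, "Q"],
})

null_counts = toy.isnull().sum()   # note the parentheses: sum() is actually called
print(null_counts)
```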

3. Importing the seaborn library and checking for null values

In [11]:
import seaborn as sns
sns.heatmap(data.isnull())
Out[11]:
<AxesSubplot:>
In [12]:
data.head(2)
Out[12]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil August 14, 2020 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ...
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico December 23, 2016 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit...
In [13]:
data[data['Title'].isin(['House of Cards'])] ## check whether any title matches 'House of Cards'
Out[13]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description
2832 s2833 TV Show House of Cards Robin Wright, David Fincher, Gerald McRaney, J... Kevin Spacey, Robin Wright, Kate Mara, Corey S... United States November 2, 2018 TV-MA 6 Seasons TV Dramas, TV Thrillers A ruthless politician will stop at nothing to ...
In [14]:
data.head(2)
Out[14]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil August 14, 2020 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ...
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico December 23, 2016 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit...
In [15]:
data.dtypes
Out[15]:
Show_Id         object
Category        object
Title           object
Director        object
Cast            object
Country         object
Release_Date    object
Rating          object
Duration        object
Type            object
Description     object
dtype: object

4. Changing the data type from object to datetime

In [16]:
data['Date_N']= pd.to_datetime(data['Release_Date'])   
data.head(2)
Out[16]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil August 14, 2020 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico December 23, 2016 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23

5. First and last release date

In [17]:
data["Release_Date"] = pd.to_datetime(data['Release_Date'])

data.Release_Date.min(),data.Release_Date.max()
Out[17]:
(Timestamp('2008-01-01 00:00:00'), Timestamp('2021-01-16 00:00:00'))

6. Finding the top 7 years in which the most Movies and TV Shows were released

In [18]:
data['Date_N'].dt.year.value_counts().head(7) 
Out[18]:
2019.0    2153
2020.0    2009
2018.0    1685
2017.0    1225
2016.0     443
2021.0     117
2015.0      88
Name: Date_N, dtype: int64
In [19]:
plt.figure(figsize=(10,5))
sns.histplot(data['Date_N'])
plt.title('distribution by released year')
Out[19]:
Text(0.5, 1.0, 'distribution by released year')
In [20]:
data.head(2)
Out[20]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23
In [21]:
data.groupby('Category').Category.value_counts()
Out[21]:
Category  Category
Movie     Movie       5377
TV Show   TV Show     2410
Name: Category, dtype: int64
In [22]:
fig1=sns.countplot(x=data['Category'])  ## pass the column as the keyword argument x, as newer seaborn versions require
for p in fig1.patches:
    fig1.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha = 'center', va = 'center', xytext = (0, 9), textcoords = 'offset points')   ## show the count above each bar

More Movies than TV Shows were released.

7. Answering some basic questions

In [23]:
data['Year'] = data['Date_N'].dt.year ## new column holding only the year
In [24]:
data.head(2)
Out[24]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
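The `Year` values display as `2020.0` rather than `2020` because a few dates are missing, and pandas falls back to floats when a column has to hold missing values. One possible way around this, sketched on made-up date strings, is the nullable `Int64` dtype. The explicit `format` and `errors="coerce"` arguments are choices made for this sketch and are not taken from the notebook.

```python
import pandas as pd

# Made-up date strings in the same "Month day, year" style; one is missing.
raw = pd.Series(["August 14, 2020", "December 23, 2016", None])

dates = pd.to_datetime(raw, format="%B %d, %Y", errors="coerce")
year_float = dates.dt.year               # floats, because of the missing date
year_int = year_float.astype("Int64")    # nullable integers; the gap becomes <NA>

print(year_float.tolist())
print(year_int.tolist())
```

Comparisons such as `year_int == 2020` work the same way on either dtype, so this mainly affects how the column is displayed.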
In [25]:
data [ (data['Category']== 'Movie') &  (data['Year']==2020) ] ## Movies released in 2020
Out[25]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
4 s5 Movie 21 Robert Luketic Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar... United States 2020-01-01 PG-13 123 min Dramas A brilliant group of students become card-coun... 2020-01-01 2020.0
6 s7 Movie 122 Yasir Al Yasiri Amina Khalil, Ahmed Dawood, Tarek Lotfy, Ahmed... Egypt 2020-06-01 TV-MA 95 min Horror Movies, International Movies After an awful accident, a couple admitted to ... 2020-06-01 2020.0
14 s15 Movie 3022 John Suits Omar Epps, Kate Walsh, Miranda Cosgrove, Angus... United States 2020-03-19 R 91 min Independent Movies, Sci-Fi & Fantasy, Thrillers Stranded when the Earth is suddenly destroyed ... 2020-03-19 2020.0
27 s28 Movie #Alive Cho Il Yoo Ah-in, Park Shin-hye South Korea 2020-09-08 TV-MA 99 min Horror Movies, International Movies, Thrillers As a grisly virus rampages a city, a lone man ... 2020-09-08 2020.0
28 s29 Movie #AnneFrank - Parallel Stories Sabina Fedeli, Anna Migotto Helen Mirren, Gengher Gatti Italy 2020-07-01 TV-14 95 min Documentaries, International Movies Through her diary, Anne Frank's story is retol... 2020-07-01 2020.0
... ... ... ... ... ... ... ... ... ... ... ... ... ...
7762 s7761 Movie Zaki Chan Wael Ihsan Ahmed Helmy, Yasmin Abdulaziz, Hassan Hosny, H... Egypt 2020-05-19 TV-PG 109 min Comedies, International Movies, Romantic Movies An unqualified young man has his work cut out ... 2020-05-19 2020.0
7783 s7782 Movie Zoom Peter Hewitt Tim Allen, Courteney Cox, Chevy Chase, Kate Ma... United States 2020-01-11 PG 88 min Children & Family Movies, Comedies Dragged from civilian life, a former superhero... 2020-01-11 2020.0
7784 s7783 Movie Zozo Josef Fares Imad Creidi, Antoinette Turk, Elias Gergi, Car... Sweden, Czech Republic, United Kingdom, Denmar... 2020-10-19 TV-MA 99 min Dramas, International Movies When Lebanon's Civil War deprives Zozo of his ... 2020-10-19 2020.0
7786 s7785 Movie Zulu Man in Japan NaN Nasty C NaN 2020-09-25 TV-MA 44 min Documentaries, International Movies, Music & M... In this documentary, South African rapper Nast... 2020-09-25 2020.0
7788 s7787 Movie ZZ TOP: THAT LITTLE OL' BAND FROM TEXAS Sam Dunn NaN United Kingdom, Canada, United States 2020-03-01 TV-MA 90 min Documentaries, Music & Musicals This documentary delves into the mystique behi... 2020-03-01 2020.0

1312 rows × 13 columns

In [26]:
data [ (data['Category']== 'TV Show') &  (data['Country']=='India') ] ['Title']  ## indexing with ['Title'] shows only the Title column
Out[26]:
86            21 Sarfarosh: Saragarhi 1897
132                              7 (Seven)
340                           Agent Raghav
364                           Akbar Birbal
533                    Anjaan: Rural Myths
                       ...                
6249                  The Creative Indians
6400    The Golden Years with Javed Akhtar
6469                The House That Made Me
7294                            Typewriter
7705                       Yeh Meri Family
Name: Title, Length: 71, dtype: object
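The filter above combines two conditions and then picks one column. An equivalent form, sketched here on a made-up frame, names the mask and passes it to `.loc` together with the column label, which keeps the row and column selection in a single step.

```python
import pandas as pd

# Toy frame with illustrative values only.
toy = pd.DataFrame({
    "Category": ["TV Show", "Movie", "TV Show"],
    "Country": ["India", "India", "Brazil"],
    "Title": ["Show A", "Film B", "Show C"],
})

mask = (toy["Category"] == "TV Show") & (toy["Country"] == "India")
titles = toy.loc[mask, "Title"]    # rows chosen by the mask, one column chosen by label
print(titles.tolist())
```

Note that `== 'India'` only matches rows whose Country is exactly that string. Titles listing several countries in one field are not matched by this comparison.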

8. Visualizing the top five directors on Netflix

In [27]:
i1=data['Director'].value_counts().head(5).plot(kind='bar',title = 'Top Five Directors on Netflix')
for p in i1.patches:
    i1.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha = 'center', va = 'center', xytext = (0, 9), textcoords = 'offset points') 
In [28]:
import plotly.express as px 
In [29]:
dff= data.groupby(['Director']).size().reset_index(name='counts')
In [30]:
data.head(2)
Out[30]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
In [31]:
data [ ((data['Category']== 'Movie') &  (data['Type']=='Comedies'))  | (data['Country'] == 'United Kingdom') ] ## & binds tighter than |, so this selects comedy Movies OR anything from the United Kingdom
Out[31]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
19 s20 Movie '89 NaN Lee Dixon, Ian Wright, Paul Merson United Kingdom 2018-05-16 TV-PG 87 min Sports Movies Mixing old footage with interviews, this is th... 2018-05-16 2018.0
33 s34 Movie #realityhigh Fernando Lebrija Nesta Cooper, Kate Walsh, John Michael Higgins... United States 2017-09-08 TV-14 99 min Comedies When nerdy high schooler Dani finally attracts... 2017-09-08 2017.0
58 s59 TV Show 100% Hotter NaN Daniel Palmer, Melissa Sophia, Karen Williams,... United Kingdom 2019-11-01 TV-14 1 Season British TV Shows, International TV Shows, Real... A stylist, a hair designer and a makeup artist... 2019-11-01 2019.0
72 s73 Movie 17 Again Burr Steers Zac Efron, Leslie Mann, Matthew Perry, Thomas ... United States 2021-01-01 PG-13 102 min Comedies Nearing a midlife crisis, thirty-something Mik... 2021-01-01 2021.0
82 s83 Movie 2036 Origin Unknown Hasraf Dulull Katee Sackhoff, Ray Fearon, Julie Cox, Steven ... United Kingdom 2018-12-20 TV-14 95 min Sci-Fi & Fantasy Working with an artificial intelligence to inv... 2018-12-20 2018.0
... ... ... ... ... ... ... ... ... ... ... ... ... ...
7670 s7669 TV Show World War II in Colour NaN Robert Powell United Kingdom 2017-08-01 TV-MA 1 Season British TV Shows, Docuseries, International TV... Footage of the most dramatic moments from Worl... 2017-08-01 2017.0
7671 s7670 TV Show World's Busiest Cities NaN Anita Rani, Ade Adepitan, Dan Snow United Kingdom 2019-02-01 TV-PG 1 Season British TV Shows, Docuseries From Moscow to Mexico City, three BBC journali... 2019-02-01 2019.0
7688 s7687 Movie XV: Beyond the Tryline Pierre Deschamps NaN United Kingdom 2020-03-18 TV-14 91 min Documentaries, Sports Movies Set against the 2015 Rugby World Cup, this doc... 2020-03-18 2020.0
7725 s7724 Movie You Can Tutu James Brown Lily O'Regan, Jeannettsy Enriquez Borges, Joel... United Kingdom 2017-12-31 TV-G 87 min Children & Family Movies A gifted young ballet dancer struggles to find... 2017-12-31 2017.0
7740 s7739 TV Show Young Wallander NaN Adam Pålsson, Richard Dillane, Leanne Best, El... United Kingdom 2020-09-03 TV-MA 1 Season Crime TV Shows, International TV Shows, TV Dramas An incendiary hate crime stirs civil unrest, f... 2020-09-03 2020.0

485 rows × 13 columns
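Because `&` binds more tightly than `|`, a mixed condition like the one above is read as "(Movie and Comedies) or United Kingdom", which is why TV Shows from the United Kingdom appear in the result. The sketch below, on a made-up frame, contrasts that reading with the alternative grouping.

```python
import pandas as pd

# Toy frame with illustrative values only.
toy = pd.DataFrame({
    "Category": ["Movie", "TV Show", "Movie"],
    "Type": ["Comedies", "Docuseries", "Dramas"],
    "Country": ["United States", "United Kingdom", "United Kingdom"],
})

is_movie = toy["Category"] == "Movie"
is_comedy = toy["Type"] == "Comedies"
is_uk = toy["Country"] == "United Kingdom"

a = toy[is_movie & is_comedy | is_uk]      # parsed as (movie AND comedy) OR uk
b = toy[is_movie & (is_comedy | is_uk)]    # movie AND (comedy OR uk)

print(len(a), len(b))
```

Writing the parentheses explicitly makes the intended grouping visible to the reader, whichever of the two was meant.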

In [32]:
data_new = data.dropna()
In [33]:
data_new.head(2)
Out[33]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
2 s3 Movie 23:59 Gilbert Chan Tedd Chan, Stella Chung, Henley Hii, Lawrence ... Singapore 2018-12-20 R 78 min Horror Movies, International Movies When an army recruit is found dead, his fellow... 2018-12-20 2018.0
In [34]:
data_new[data_new['Cast'].str.contains('Tom Cruise')] ## str.contains returns NaN for null values, which cannot be used as a boolean mask, so the nulls were dropped in the previous cell (passing na=False is an alternative)
Out[34]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
3860 s3861 Movie Magnolia Paul Thomas Anderson Jeremy Blackman, Tom Cruise, Melinda Dillon, A... United States 2020-01-01 R 189 min Dramas, Independent Movies Through chance, human action, past history and... 2020-01-01 2020.0
5071 s5071 Movie Rain Man Barry Levinson Dustin Hoffman, Tom Cruise, Valeria Golino, Ge... United States 2019-07-01 R 134 min Classic Movies, Dramas A fast-talking yuppie is forced to slow down w... 2019-07-01 2019.0
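Dropping every row with any null value also discards titles whose cast is known but whose director or country is missing. As a hedged alternative, `str.contains` accepts `na=False`, which treats a missing value as "no match" so the full table can be searched. The names below are placeholders, not entries from the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame; the middle row has no cast information.
toy = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C"],
    "Cast": ["Actor One, Actor Two", np.nan, "Actor Two"],
})

# na=False counts a missing Cast as "no match"; regex=False searches for the literal text.
mask = toy["Cast"].str.contains("Actor Two", na=False, regex=False)
print(toy.loc[mask, "Title"].tolist())
```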
In [35]:
data[data['Cast'] == 'Tom Cruise']
Out[35]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
In [36]:
data.head(2)
Out[36]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
In [37]:
data['Rating'].nunique()
Out[37]:
14
In [38]:
data['Rating'].unique()
Out[38]:
array(['TV-MA', 'R', 'PG-13', 'TV-14', 'TV-PG', 'NR', 'TV-G', 'TV-Y', nan,
       'TV-Y7', 'PG', 'G', 'NC-17', 'TV-Y7-FV', 'UR'], dtype=object)
In [39]:
data.head(2)
Out[39]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0

9. Finding the top ten ratings on Netflix

In [40]:
fig5=data['Rating'].value_counts().head(10).plot(kind='bar',title='Top Ten Ratings on Netflix')
for p in fig5.patches:
    fig5.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha = 'center', va = 'center', xytext = (0, 9), textcoords = 'offset points')
In [41]:
data[(data['Category']=='Movie') & (data['Rating']== 'TV-14')].shape
Out[41]:
(1272, 13)
In [42]:
data[(data['Category']=='Movie') & (data['Rating']== 'TV-14') & (data['Country'] == 'Canada')].shape
Out[42]:
(11, 13)
In [43]:
data.head(2)
Out[43]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
In [44]:
data[(data['Category']=='TV Show') & (data['Rating']== 'R') & (data["Year"]>2018)]
Out[44]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
6437 s6436 TV Show The Hateful Eight: Extended Version Quentin Tarantino Samuel L. Jackson, Kurt Russell, Jennifer Jaso... NaN 2019-04-25 R 1 Season TV Shows Trapped at a stagecoach stop as a storm rages ... 2019-04-25 2019.0
In [45]:
data.head(2)
Out[45]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
In [46]:
data.Duration.dtypes
Out[46]:
dtype('O')
In [47]:
data.head(2)
Out[47]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
In [48]:
data_tvshow = data[data['Category'] =='TV Show']
In [49]:
data_tvshow.head(2)
Out[49]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
5 s6 TV Show 46 Serdar Akar Erdal Beşikçioğlu, Yasemin Allen, Melis Birkan... Turkey 2017-07-01 TV-MA 1 Season International TV Shows, TV Dramas, TV Mysteries A genetics professor experiments with a treatm... 2017-07-01 2017.0

10. Highest number of TV Shows by country

In [50]:
fig6=data_tvshow.Country.value_counts().head(5).plot(kind='bar',title='Highest number of TV Shows by country')
for p in fig6.patches:
    fig6.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha = 'center', va = 'center', xytext = (0, 9), textcoords = 'offset points')
In [51]:
data_tvshow.Country.value_counts().head(1)
Out[51]:
United States    705
Name: Country, dtype: int64
In [52]:
data.head(2)
Out[52]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
In [53]:
data.sort_values(by = 'Year').head(5)
Out[53]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
7115 s7114 Movie To and From New York Sorin Dan Mihalcescu Barbara King, Shaana Diya, John Krisiukenas, Y... United States 2008-01-01 TV-MA 81 min Dramas, Independent Movies, Thrillers While covering a story in New York City, a Sea... 2008-01-01 2008.0
1765 s1766 TV Show Dinner for Five NaN NaN United States 2008-02-04 TV-MA 1 Season Stand-Up Comedy & Talk Shows In each episode, four celebrities join host Jo... 2008-02-04 2008.0
3248 s3249 Movie Just Another Love Story Ole Bornedal Anders W. Berthelsen, Rebecka Hemse, Nikolaj L... Denmark 2009-05-05 TV-MA 104 min Dramas, International Movies When he causes a car accident that leaves a yo... 2009-05-05 2009.0
5766 s5766 Movie Splatter Joe Dante Corey Feldman, Tony Todd, Tara Leigh, Erin Way... United States 2009-11-18 TV-MA 29 min Horror Movies After committing suicide, a washed-up rocker r... 2009-11-18 2009.0
3840 s3841 Movie Mad Ron's Prevues from Hell Jim Monaco Nick Pawlow, Jordu Schell, Jay Kushwara, Micha... United States 2010-11-01 NR 84 min Cult Movies, Horror Movies This collection cherry-picks trailers, forgott... 2010-11-01 2010.0
In [54]:
data.sort_values(by = 'Year', ascending = False).head(2)
Out[54]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
5564 s5564 Movie Sherlock Holmes Guy Ritchie Robert Downey Jr., Jude Law, Rachel McAdams, M... United States, Germany, United Kingdom, Australia 2021-01-01 PG-13 128 min Action & Adventure, Comedies The game is afoot for an eccentric detective w... 2021-01-01 2021.0
5919 s5919 Movie Surf's Up Ash Brannon, Chris Buck Shia LaBeouf, Jeff Bridges, Zooey Deschanel, J... United States 2021-01-01 PG 86 min Children & Family Movies, Comedies, Sports Movies This Oscar-nominated animated comedy goes behi... 2021-01-01 2021.0

11. Recent trend on Netflix

In [55]:
data.head(2)
Out[55]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
In [56]:
data1=data.groupby(['Year','Category']).size().reset_index(name='Total Content').sort_values(by=['Year'],ascending=False).head(10)
data1
Out[56]:
Year Category Total Content
23 2021.0 TV Show 29
22 2021.0 Movie 88
21 2020.0 TV Show 697
20 2020.0 Movie 1312
19 2019.0 TV Show 656
18 2019.0 Movie 1497
17 2018.0 TV Show 430
16 2018.0 Movie 1255
15 2017.0 TV Show 361
14 2017.0 Movie 864
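The long table above has one row per year and category. To compare the two categories year by year, one option is to reshape the grouped counts into a wide table with `unstack` and then take the difference between consecutive years. The sketch uses made-up rows in place of the real `Year` and `Category` columns.

```python
import pandas as pd

# Made-up rows standing in for the Year and Category columns.
toy = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2020, 2020],
    "Category": ["Movie", "Movie", "TV Show", "Movie", "TV Show"],
})

# One row per year, one column per category.
wide = toy.groupby(["Year", "Category"]).size().unstack(fill_value=0)
# Change from the previous year, per category (the first year has no predecessor).
change = wide.diff()

print(wide)
print(change)
```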
In [57]:
fig = px.line(data1, x="Year", y="Total Content",color='Category',title='Recent Trend of Movie and TV Show') # recent trend in Movies and TV Shows
fig.update_layout(template='plotly_dark')  ## apply the dark theme before showing the figure
fig.show()

12. Top ten cast members

In [58]:
d1= data['Cast'].str.split(',',expand=True).stack() ## split the cast names on commas (names after the first keep a leading space)
d2 =d1.to_frame() ## convert to a dataframe
d2.columns=['Cast'] ## name the column Cast
d3=d2.groupby(['Cast']).size().reset_index(name='Counts').sort_values(by=['Counts'],ascending=False) ## count titles per name and sort by count
d4= d3[d3['Cast'] != 'No cast specified']   ## remove the placeholder for unspecified cast
d5=d4.sort_values(by=['Counts'],ascending=False).head(10)  ## keep the top ten cast members
d5
Out[58]:
Cast Counts
2321 Anupam Kher 38
27238 Takahiro Sakurai 28
34592 Shah Rukh Khan 27
21196 Om Puri 27
3758 Boman Irani 25
1692 Andrea Libman 24
21481 Paresh Rawal 24
29991 Yuki Kaji 23
29025 Vincent Tong 22
30588 Akshay Kumar 22
In [59]:
fig = px.bar(d5, x='Cast', y='Counts',text_auto=True, title = 'Top Ten cast on Netflix')
fig.update_layout(template='plotly_dark')  ## dark plotly theme, applied before the figure is shown
fig.show()
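Splitting on `','` alone leaves a space in front of every name after the first, so the same person can be counted under two different strings. This would also explain why the genre table in the next section lists "Dramas" twice. A sketch of one way to avoid it, using placeholder names rather than real cast data, is to strip the whitespace after exploding the lists:

```python
import pandas as pd

# Made-up cast strings; note the space after each comma.
cast = pd.Series(["Actor One, Actor Two", "Actor Two, Actor Three", None])

names = (
    cast.dropna()
        .str.split(",")    # list of names per title
        .explode()         # one row per name
        .str.strip()       # remove the space left after each comma
)
counts = names.value_counts()
print(counts)
```

Applied to the real columns, this would likely change the counts shown in the notebook, so the printed figures should be read with that caveat in mind.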

13. Top ten types of content on Netflix

In [60]:
t1= data['Type'].str.split(',',expand=True).stack() ## split the genres each title is listed in; entries after the first keep a leading space, so e.g. 'Dramas' and ' Dramas' are counted separately
t1 = t1.to_frame()
t1.columns= ['Types of Content']
t2= t1.groupby(['Types of Content']).size().reset_index(name='Counts').sort_values(by=['Counts'],ascending=False)
t3=t2.head(10)
t3
t2
Out[60]:
Types of Content Counts
13 International Movies 2323
49 Dramas 1384
44 Comedies 1074
47 Documentaries 751
9 Dramas 722
... ... ...
59 Romantic Movies 3
62 Spanish-Language TV Shows 2
63 Sports Movies 1
70 TV Sci-Fi & Fantasy 1
55 LGBTQ Movies 1

73 rows × 2 columns

In [61]:
fig = px.pie(t3, values='Counts', names='Types of Content', title='Top Ten Types of Content on Netflix')
fig.show()

14. Finding the maximum and minimum duration of movies and TV show seasons, and their titles

In [62]:
data.head(2)
Out[62]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0
In [63]:
# Extract the numeric part from the 'Duration' column (minutes for Movies, seasons for TV Shows)
data['numeric_duration'] = data['Duration'].apply(lambda x: int(''.join(filter(str.isdigit, x))))

data.head(2)
Out[63]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year numeric_duration
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0 4
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0 93
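The lambda above joins every digit found in the string. An alternative sketch, which is an assumption of this example rather than the notebook's method, uses the vectorised `str.extract` with a regular expression and also keeps the unit, so minutes and seasons are not compared by accident.

```python
import pandas as pd

# Duration strings in the two shapes that appear in the dataset.
duration = pd.Series(["4 Seasons", "93 min", "1 Season"])

# Capture the first run of digits and convert it to an integer.
numeric = duration.str.extract(r"(\d+)", expand=False).astype(int)
# Capture the text after the number as the unit.
unit = duration.str.extract(r"\d+\s*(\D+)", expand=False).str.strip()

print(numeric.tolist(), unit.tolist())
```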
In [64]:
## Max Season of a TV show
data[data['Category']=='TV Show'].numeric_duration.max()
Out[64]:
16
In [65]:
data [ (data['Category']== 'TV Show') &  (data['numeric_duration']== 16) ] [['Title','Description','Cast']]
Out[65]:
Title Description Cast
2538 Grey's Anatomy Intern (and eventual resident) Meredith Grey f... Ellen Pompeo, Sandra Oh, Katherine Heigl, Just...
In [66]:
## max Duration of a Movie
data[data['Category']=='Movie'].numeric_duration.max()
Out[66]:
312
In [67]:
##  minimum Duration of a Movie
data[data['Category']=='Movie'].numeric_duration.min()
Out[67]:
3
In [68]:
### Shortest Movie: title and director
data [ (data['Category']== 'Movie') &  (data['numeric_duration']== 3) ] [['Title','Description','Director']]
Out[68]:
Title Description Director
5606 Silent "Silent" is an animated short film created by ... Limbert Fabian, Brandon Oldenburg

Month-wise releases on Netflix

In [69]:
data.head(2)
Out[69]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year numeric_duration
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0 4
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0 93
In [70]:
data['Month'] = data['Date_N'].dt.month
data.head(2)
Out[70]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year numeric_duration Month
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0 4 8.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0 93 12.0
In [71]:
e1= data['Month']
e1 = e1.to_frame()
e1.columns= ['Month Number']
e2= e1.groupby(['Month Number']).size().reset_index(name='Count').sort_values(by=['Count'],ascending=False)
e2
Out[71]:
Month Number Count
11 12.0 833
9 10.0 785
0 1.0 757
10 11.0 738
2 3.0 669
8 9.0 619
7 8.0 618
3 4.0 601
6 7.0 600
4 5.0 543
5 6.0 542
1 2.0 472
In [72]:
fig = px.bar(e2, x='Month Number', y='Count',text_auto=True, title = 'Content released per month on Netflix')
fig.show()
In [73]:
data.head(2)
Out[73]:
Show_Id Category Title Director Cast Country Release_Date Rating Duration Type Description Date_N Year numeric_duration Month
0 s1 TV Show 3% NaN João Miguel, Bianca Comparato, Michel Gomes, R... Brazil 2020-08-14 TV-MA 4 Seasons International TV Shows, TV Dramas, TV Sci-Fi &... In a future where the elite inhabit an island ... 2020-08-14 2020.0 4 8.0
1 s2 Movie 07:19 Jorge Michel Grau Demián Bichir, Héctor Bonilla, Oscar Serrano, ... Mexico 2016-12-23 TV-MA 93 min Dramas, International Movies After a devastating earthquake hits Mexico Cit... 2016-12-23 2016.0 93 12.0
In [74]:
data.Country.value_counts().nlargest(3).sum()/len(data)*100
Out[74]:
49.762424553743415
In [75]:
data.Country.value_counts().nlargest(10).sum()/len(data)*100
Out[75]:
63.06664954411198
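The two cells above compute the share of all rows covered by the most common countries. The same calculation can be wrapped in a small helper. The function name `top_n_share` and the toy data are made up for this sketch.

```python
import numpy as np
import pandas as pd

def top_n_share(column: pd.Series, n: int) -> float:
    """Percent of all rows (missing values included) covered by the n most common values."""
    counts = column.value_counts()    # missing values are left out of the counts
    return counts.nlargest(n).sum() / len(column) * 100

# Made-up country column with one missing entry.
country = pd.Series(["US", "US", "India", "UK", np.nan])
print(top_n_share(country, 1), top_n_share(country, 2))
```

As in the notebook, the denominator is the full row count, so rows without a country lower the share. A field listing several countries counts as its own separate value.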

Conclusions from this Netflix dataset

1. Data description and processing

1.1 The dataset has 11 columns and 7,789 entries
1.2 Two duplicate rows were removed, leaving 7,787 entries
1.3 Rows with null values were dropped only in a separate copy (data_new); the main table keeps them
1.4 The data covers 2008-01-01 to 2021-01-16

2. Exploratory Analysis

2.1 Most titles were released in 2019 (2,153 Movies and TV Shows)
2.2 Movies outnumber TV Shows: about 69% of titles are Movies (5,377 of 7,787)
2.3 Raul Campos and Jan Suter, credited together, have the most titles as directors (18)
2.4 Only two movies in this dataset have Tom Cruise in the cast
2.6 There are 14 different ratings on Netflix
2.7 The most common rating is TV-MA, with 2,863 titles
2.8 TV-MA and TV-14 together account for about 50% of this dataset
2.9 The USA has the highest number of TV Shows, more than 3 times as many as the second country, the UK
2.10 The most frequently cast actor is Anupam Kher, with 38 titles
2.11 24.5% of the content is International Movies
2.12 International Movies, Dramas and Comedies make up 50% of the content on Netflix
2.13 The maximum number of seasons for a TV Show is 16; the longest movie runs 312 minutes and the shortest, 'Silent', runs 3 minutes

3. Trend Analysis

3.1 The recent trend favours TV Shows: from 2019 to 2020 TV Show releases rose (656 to 697) while Movie releases fell (1,497 to 1,312)
3.2 October, November, December and January together account for about 40% of the content released
3.3 December is the month with the most content released
3.4 The top three countries account for about 50% of the TV Shows and Movies; this rises to about 63% for the top ten countries
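The percentages quoted in the conclusions can be re-derived from the counts printed in the notebook outputs. A short arithmetic check, with the numbers copied from the Category counts and the month table:

```python
# Counts copied from the notebook outputs above.
movies, tv_shows = 5377, 2410
month_counts = {12: 833, 10: 785, 1: 757, 11: 738, 3: 669, 9: 619,
                8: 618, 4: 601, 7: 600, 5: 543, 6: 542, 2: 472}

# Share of Movies among all titles.
movie_share = movies / (movies + tv_shows) * 100

# Share of titles released in October, November, December and January.
peak_months = sum(month_counts[m] for m in (10, 11, 12, 1))
peak_share = peak_months / sum(month_counts.values()) * 100

print(round(movie_share, 1), round(peak_share, 1))
```

The month table sums to 7,777 rather than 7,787 because the ten titles without a release date have no month.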